44 research outputs found

    Static Graphs for Coding Productivity in OpenACC

    The main contribution of this work is to increase the coding productivity of GPU programming by using the concept of Static Graphs. To do so, we have combined the new CUDA Graph API with the OpenACC programming model. As a test case we use a well-known and widely used problem in HPC and AI: Particle Swarm Optimization. We complement the OpenACC functionality with the use of CUDA Graph, achieving accelerations of more than one order of magnitude and performance very close to a reference, optimized CUDA code. Finally, we propose a new specification to incorporate the concept of Static Graphs into the OpenACC specification. This project has received funding from the EPEEC project under the European Union’s Horizon 2020 Research and Innovation program, grant agreement No. 801051. Peer reviewed. Postprint (author's final draft).
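    The abstract itself contains no code; as a rough illustration of the CUDA Graph mechanism the work builds on, the sketch below records a short sequence of kernel launches into a graph once and then replays the instantiated graph each iteration, which is what removes most of the per-kernel launch overhead. The kernels, names, and launch parameters are hypothetical, not taken from the paper.

```cuda
// Hedged sketch: capture two hypothetical PSO-style kernels into a CUDA Graph
// once, then replay the graph so each iteration costs a single launch call.
#include <cuda_runtime.h>

__global__ void update_velocity(float *v, const float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= 0.9f;          // placeholder velocity update
}

__global__ void update_position(float *x, const float *v, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += v[i];          // placeholder position update
}

void run(float *d_x, float *d_v, int n, int iterations) {
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    dim3 block(256), grid((n + block.x - 1) / block.x);

    // Record the launch sequence once instead of issuing it every iteration.
    cudaGraph_t graph;
    cudaGraphExec_t graphExec;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    update_velocity<<<grid, block, 0, stream>>>(d_v, d_x, n);
    update_position<<<grid, block, 0, stream>>>(d_x, d_v, n);
    cudaStreamEndCapture(stream, &graph);
    cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);

    // Replaying the graph launches both kernels with one API call.
    for (int it = 0; it < iterations; ++it)
        cudaGraphLaunch(graphExec, stream);
    cudaStreamSynchronize(stream);

    cudaGraphExecDestroy(graphExec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(stream);
}
```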

    Cross-lingual searching and visualization for Greek and Latin and Old Norse texts

    We explore approaches to multilingual information retrieval for Greek, Latin, and Old Norse texts, together with an innovative visualization facility for the results.

    Towards enhancing coding productivity for GPU programming using static graphs

    The main contribution of this work is to increase the coding productivity of GPU programming by using the concept of Static Graphs. GPU capabilities have been increasing significantly in terms of performance and memory capacity, but there are still problems of scalability and limits on the amount of work a GPU can perform at a time. To minimize the overhead associated with launching GPU kernels, and to maximize the use of GPU capacity, we have combined the new CUDA Graph API with the CUDA programming model (including the CUDA math libraries) and with the OpenACC programming model. As test cases we use two different, well-known and widely used problems in HPC and AI: the Conjugate Gradient method and Particle Swarm Optimization. In the first test case (Conjugate Gradient) we focus on the integration of Static Graphs with CUDA. Here we significantly outperform the NVIDIA reference code, reaching an acceleration of up to 11× thanks to a better implementation that can benefit from the new CUDA Graph capabilities. In the second test case (Particle Swarm Optimization), we complement the OpenACC functionality with the use of CUDA Graph, again achieving accelerations of up to one order of magnitude, with average speedups ranging from 2× to 4× and performance very close to a reference, optimized CUDA code. Our main target is a higher coding productivity model for GPU programming that, by using Static Graphs, provides a better exploitation of GPU capacity in a very transparent way. Combining Static Graphs with two of the most important current GPU programming models (CUDA and OpenACC) considerably reduces execution time with respect to using CUDA or OpenACC alone, achieving accelerations of more than one order of magnitude. Finally, we propose an interface to incorporate the concept of Static Graphs into the OpenACC specification. This research was funded by the EPEEC project under the European Union’s Horizon 2020 Research and Innovation program, grant agreement No. 801051. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains, and the publisher, by accepting the article for publication, acknowledges that the US government retains, a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or to allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan, accessed on 13 April 2022). Peer reviewed. Postprint (published version).
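    As with the previous entry, the paper's proposed OpenACC interface is not reproduced here; the sketch below only illustrates one way such an OpenACC/CUDA Graph combination can be expressed today, assuming the NVHPC toolchain, where openacc.h exposes the acc_get_cuda_stream() interoperability routine. The loop body, queue number, and update formulas are made up for illustration.

```c
/* Hedged sketch (not the interface proposed in the paper): capture the kernels
 * generated for an OpenACC compute region into a CUDA Graph by reusing the
 * CUDA stream that backs an OpenACC async queue, then replay the graph. */
#include <openacc.h>
#include <cuda_runtime.h>

void run_with_graph(float *restrict x, float *restrict v, int n, int iterations)
{
    const int queue = 1;  /* OpenACC async queue backing the captured stream */
    cudaStream_t stream = (cudaStream_t)acc_get_cuda_stream(queue);

    cudaGraph_t graph;
    cudaGraphExec_t graphExec;

    /* Data are assumed to already be on the device (e.g., via acc enter data). */
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    #pragma acc parallel loop async(queue) present(x[0:n], v[0:n])
    for (int i = 0; i < n; ++i) {
        v[i] *= 0.9f;        /* placeholder velocity update */
        x[i] += v[i];        /* placeholder position update */
    }
    cudaStreamEndCapture(stream, &graph);
    cudaGraphInstantiate(&graphExec, graph, NULL, NULL, 0);

    /* One graph launch per iteration instead of one launch per OpenACC region. */
    for (int it = 0; it < iterations; ++it)
        cudaGraphLaunch(graphExec, stream);
    cudaStreamSynchronize(stream);

    cudaGraphExecDestroy(graphExec);
    cudaGraphDestroy(graph);
}
```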

    Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

    We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SYCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., numpy, Numba, cuPy, and pyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). We use the GitHub Copilot capabilities powered by OpenAI Codex, available in Visual Studio Code as of April 2023, to generate a large number of implementations from simple prompt variants. To quantify and compare the results, we propose a proficiency metric based on the initial 10 suggestions given for each prompt. Results suggest that the OpenAI Codex outputs for C++ correlate with the adoption and maturity of programming models: for example, OpenMP and CUDA score highly, whereas HIP is still lacking. We found that prompts in a targeted language such as Fortran, or in the more general-purpose Python, can benefit from adding code keywords, while Julia prompts perform acceptably well for its mature programming models (e.g., Threads and CUDA.jl). We expect these benchmarks to provide a point of reference for each programming model's community. Overall, understanding the convergence of large language models, AI, and HPC is crucial because of its rapidly evolving nature and how it is redefining human-computer interactions. Comment: Accepted at the Sixteenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2023, held in conjunction with ICPP 2023, the 52nd International Conference on Parallel Processing. 10 pages, 6 figures, 5 tables.
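    For concreteness, below is a minimal CUDA version of AXPY (y = a*x + y), the simplest of the kernels the prompts target. It is only an illustrative sketch of the kind of code being generated and scored; it is not one of the Copilot/Codex suggestions from the study, and the launch configuration in the comment is an assumption.

```cuda
// Illustrative AXPY kernel (y = a*x + y): one thread per vector element.
__global__ void axpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

// Example host-side launch with 256 threads per block (d_x and d_y are device
// pointers allocated and filled elsewhere):
//   axpy<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y);
```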

    Process analytical technology as key-enabler for digital twins in continuous biomanufacturing

    Over the last few years, rapid progress has been made in adopting well-known process modeling techniques from chemicals manufacturing for biologics manufacturing. The main challenge has been analytical methods, as engineers need quantitative data for their workflows. Industry 4.0, Internet of Things, artificial intelligence, and machine learning activities, up to big data analysis, have taken their share in solving fundamental problems such as component-specific, or at least group-specific, evaluation of spectroscopic data. Besides the inline analytical methods included in process analytical technology concepts, the key technology has been the generation of decisive, validated digital twins based on process models. This review summarizes the methodology for achieving a holistic understanding of process models, control, and optimization by means of digital twins, using recent work published in this field as an example.

    Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

    We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We build upon our previous work, based on OpenAI Codex (a descendant of GPT-3), which generated similar kernels from simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 and our original GPT-3 baseline using a similar metric. Llama-2 has a simplified model that shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct. Comment: Accepted at LCPC 2023, the 36th International Workshop on Languages and Compilers for Parallel Computing (http://www.lcpcworkshop.org/LCPC23/). 13 pages, 5 figures, 1 table.
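    As another point of reference for the kernels being compared, here is a naive CUDA GEMV (y = A*x, row-major, one thread per row). Like the AXPY sketch above, it is illustrative only and was not produced by either Llama-2 or Copilot.

```cuda
// Illustrative GEMV kernel: y = A * x for a row-major m x n matrix A.
// Each thread computes one row of the result; no tiling or shared memory.
__global__ void gemv(int m, int n, const float *A, const float *x, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < m) {
        float sum = 0.0f;
        for (int col = 0; col < n; ++col)
            sum += A[row * n + col] * x[col];
        y[row] = sum;
    }
}
```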

    Julia as a unifying end-to-end workflow language on the Frontier exascale system

    We evaluate Julia as a single language and ecosystem paradigm, powered by LLVM, for developing workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the performance, scaling, and trade-offs of (i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive analysis. Results suggest that although Julia generates reasonable LLVM IR, a nearly 50% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition language, as measured on the fastest supercomputer in the world. Comment: 11 pages, 8 figures, accepted at the 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23), held with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23).
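    To make the memory-bound kernel concrete: the paper's kernel is written in Julia (with HIP for the AMD MI250x GPUs), but the access pattern of a 7-point stencil is easy to sketch. The CUDA version below is an assumed illustration of that pattern, not the code benchmarked in the paper, and the parameter names are invented.

```cuda
// Illustrative 7-point stencil on a 3D grid: each interior point reads its six
// face neighbours plus itself, the memory-bound pattern at the core of a
// Gray-Scott diffusion step.
__global__ void stencil7(int nx, int ny, int nz,
                         const float *u, float *out, float coef) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;
    if (i < 1 || j < 1 || k < 1 || i >= nx - 1 || j >= ny - 1 || k >= nz - 1)
        return;  // skip boundary points

    long plane = (long)nx * ny;
    long idx = (long)k * plane + (long)j * nx + i;
    out[idx] = u[idx] + coef * (u[idx - 1] + u[idx + 1]
                              + u[idx - nx] + u[idx + nx]
                              + u[idx - plane] + u[idx + plane]
                              - 6.0f * u[idx]);
}
```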

    A small-molecule inhibitor of the NLRP3 inflammasome for the treatment of inflammatory diseases

    The NOD-like receptor (NLR) family, pyrin domain-containing protein 3 (NLRP3) inflammasome is a component of the inflammatory process, and its aberrant activation is pathogenic in inherited disorders such as cryopyrin-associated periodic syndrome (CAPS) and in complex diseases such as multiple sclerosis, type 2 diabetes, Alzheimer's disease, and atherosclerosis. We describe the development of MCC950, a potent, selective, small-molecule inhibitor of NLRP3. MCC950 blocked canonical and noncanonical NLRP3 activation at nanomolar concentrations and specifically inhibited activation of NLRP3 but not the AIM2, NLRC4, or NLRP1 inflammasomes. MCC950 reduced interleukin-1β (IL-1β) production in vivo and attenuated the severity of experimental autoimmune encephalomyelitis (EAE), a disease model of multiple sclerosis. Furthermore, MCC950 treatment rescued neonatal lethality in a mouse model of CAPS and was active in ex vivo samples from individuals with Muckle-Wells syndrome. MCC950 is thus a potential therapeutic for NLRP3-associated syndromes, including autoinflammatory and autoimmune diseases, and a tool for further study of the NLRP3 inflammasome in human health and disease.